Flexible Parameter Tying for Conversational Speech Recognition

نویسندگان

  • Hua Yu
  • Alex Waibel
چکیده

Modeling pronunciation variation is key for recognizing conversational speech. Previous efforts on pronunciation modeling by modifying dictionaries only yielded marginal improvement. Due to complex interaction between dictionaries and acoustic models, we believe a pronunciation modeling scheme is plausible only when closely coupled with the underlying acoustic model. This paper explores the use of flexible parameter tying for pronunciation modeling. In particular, two new techniques are investigated: Gaussian tying and flexible tree clustering. We report a 1.3% absolute WER improvement over the traditional modeling framework on the Switchboard task.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Parameter tying for flexible speech recognition

This paper presents two parameter tying techniques which enable a trade-off between computational cost and recognition performances of a speaker independent flexible speech recognition system working over the telephone network. Parameter tying is conducted at phonetic and acoustic levels. At the phonetic level, allophone and triphone based phonetic modeling are used simultaneously to achieve th...

متن کامل

Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition

Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...

متن کامل

Enhanced tree clustering with single pronunciation dictionary for conversational speech recognition

Modeling pronunciation variation is key for recognizing conversational speech. Rather than being limited to dictionary modeling, we argue that triphone clustering is an integral part of pronunciation modeling. We propose a new approach called enhanced tree clustering. This approach, in contrast to traditional decision tree based state tying, allows parameter sharing across phonemes. We show tha...

متن کامل

Recognizing Sloppy Speech

As speech recognition moves from labs into the real world, the sloppy speech problem emerges as a major challenge. Sloppy speech, or conversational speech, refers to the speaking style people typically use in daily conversations. The recognition error rate for sloppy speech has been found to double that of read speech in many circumstances. Previous work on sloppy speech has focused on modeling...

متن کامل

Tree-structured models of parameter dependence for rapid adaptation in large vocabulary conversational speech recognition

Two models of statistical dependence between acoustic model parameters of a large vocabulary conversational speech recognition (LVCSR) system are investigated for the purpose of rapid speakerand environment-adaptation from a very small amount of speech: (i) a Gaussian multiscale process governed by a stochastic linear dynamical system on a tree, and (ii) a simple hierarchical treestructured pri...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003